Optimizing disambiguation in Swahili

نویسنده

  • Arvi Hurskainen
چکیده

It is argued in this paper that an optimal solution to disambiguation is a combination of linguistically motivated rules and resolution based on probability or heuristic rules. By disambiguation is here meant ambiguity resolution on all levels of language analysis, including morphology and semantics. The discussion is based on Swahili, for which a comprehensive analysis system has been developed by using two-level description in morphology and constraint grammar formalism in disambiguation. Particular attention is paid to optimising the use of different solutions for achieving maximal precision with minimal rule writing.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Expanding a multilingual media monitoring and information extraction tool to a new language: Swahili

The Europe Media Monitor (EMM) family of applications is a set of multilingual tools that gather, cluster and classify news in currently fifty languages and that extract named entities and quotations (reported speech) from twenty languages. In this paper, we describe the recent effort of adding the African Bantu language Swahili to EMM. EMM is designed in an entirely modular way, allowing plugg...

متن کامل

Disambiguation of morphological analysis in Bantu languages

The paper describes problems in disambiguating the morphological analysis of Bantu languages by using Swahili as a test language. The main factors of ambiguity in this language group can be traced to the noun class structure on one hand and to the bi-directional word-formation on the other. In analyzing word-forms, the system applied utilizes SWATWOL, a morphological parsing program based on tw...

متن کامل

Applying Finite-State Methods to the Swahili Language

Herein, we explore the current finite-state methods that exist for analyzing English grammar and decide whether they can be applied to the Swahili language and Swahili syntactic patterns. Further, we to explore the differences between Swahili grammar and English grammar to see if it is possible to accommodate these finite-state methods to the Swahili language. In the end, the objective is to de...

متن کامل

Optimizing Classifier Performance in Word Sense Disambiguation by Redefining Sense Classes

Learning word sense classes has been shown to be useful in fine-grained word sense disambiguation [Kohomban and Lee, 2005]. However, the common choice for sense classes, WordNet lexicographer files, are not designed for machine learning based word sense disambiguation. In this work, we explore the use of clustering techniques in an effort to construct sense classes that are more suitable for wo...

متن کامل

Characteristics of Swahili–English bilingual agrammatic spontaneous speech and the consequences for understanding agrammatic aphasia

Most studies on spontaneous speech of individuals with agrammatism have focused almost exclusively on monolingual individuals. There is hardly any previous research on bilinguals, especially of structurally different languages; and none on characterization of agrammatism in Swahili. The current study identifies the features of Swahili agrammatic narrative and spontaneous speech, and compares th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004